Asymptotic properties of constrained Markov Decision Processes

Author

  • Eitan Altman
Abstract

We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) convergence of MDPs with a truncated state space to the problem with infinite state space, (3) convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems.

Similar resources

Constrained Markovian dynamics of random graphs

We introduce a statistical mechanics formalism for the study of constrained graph evolution as a Markovian stochastic process, in analogy with that available for spin systems, deriving its basic properties and highlighting the role of the ‘mobility’ (the number of allowed moves for any given graph). As an application of the general theory we analyze the properties of degree-preserving Markov ch...

Accelerated decomposition techniques for large discounted Markov decision processes

Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...

Metrics for Labeled Markov Systems

Partial Labeled Markov Chains are simultaneously generalizations of process algebra and of traditional Markov chains. They provide a foundation for interacting discrete probabilistic systems, the interaction being synchronization on labels as in process algebra. Existing notions of process equivalence are too sensitive to the exact probabilities of various transitions. This paper addresses cont...

Algebraic System Analysis of Timed Petri Nets

We show that Continuous Timed Petri Nets (CTPN) can be modeled by generalized polynomial recurrent equations in the (min,+) semiring. We establish a correspondence between CTPN and Markov decision processes. We survey the basic system theoretical results available: behavioral (input-output) properties, algebraic representations, asymptotic regime. Particular attention is paid to the subclass o...

Journal title:
  • ZOR - Meth. & Mod. of OR

Volume 37  Issue 

Pages  -

Publication date 1993